An improved minimum generation error based model adaptation for HMM-based speech synthesis
نویسندگان
چکیده
Aminimum generation error (MGE) criterion had been proposed for model training in HMM-based speech synthesis. In this paper, we apply the MGE criterion to model adaptation for HMM-based speech synthesis, and introduce an MGE linear regression (MGELR) based model adaptation algorithm, where the regression matrices used to transform source models are optimized so as to minimize the generation errors of adaptation data. In addition, we incorporate the recent improvements of MGE criterion into MGELR-based model adaptation, including state alignment under MGE criterion and using a log spectral distortion (LSD) instead of Euclidean distance for spectral distortion measure. From the experimental results, the adaptation performance was improved after incorporating these two techniques, and the formal listening tests showed that the quality and speaker similarity of synthesized speech after MGELRbased adaptation were significantly improved over the original MLLR-based adaptation.
منابع مشابه
Overview of NIT HMM - based speech synthesis system for Blizzard Challenge 2010
This paper describes a hidden Markov model (HMM)-based speech synthesis system developed for the Blizzard Challenge 2010. This system employs STRAIGHT vocoding, minimum generation error (MGE) training, minimum generation error linear regression (MGELR) based model adaptation, the Bayesian speech synthesis framework, and the parameter generation algorithm considering global variance. The real-ti...
متن کاملMinimum generation error criterion for tree-based clustering of context dependent HMMs
Due to the inconsistency between HMM training and synthesis application in HMM-based speech synthesis, the minimum generation error (MGE) criterion had been proposed for HMM training. This paper continues to apply the MGE criterion for tree-based clustering of context dependent HMMs. As directly applying the MGE criterion results in an unacceptable computational cost, the parameter updating rul...
متن کاملMinimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis
A minimum generation error (MGE) criterion had been proposed to solve the issues related to maximum likelihood (ML) based HMM training in HMM-based speech synthesis. In this paper, we improve the MGE criterion by imposing a log spectral distortion (LSD) instead of the Euclidean distance to define the generation error between the original and generated line spectral pair (LSP) coefficients. More...
متن کاملCross-Validation and Minimum Generation Error based Decision Tree Pruning for HMM-based Speech Synthesis
This paper presents a decision tree pruning method for the model clustering of HMM-based parametric speech synthesis by cross-validation (CV) under the minimum generation error (MGE) criterion. Decision-tree-based model clustering is an important component in the training process of an HMM based speech synthesis system. Conventionally, the maximum likelihood (ML) criterion is employed to choose...
متن کاملUSTC System for Blizzard Challenge 2006 an Improved HMM-based Speech Synthesis Method
This paper introduces the USTC speech synthesis system for Blizzard Challenge 2006. The HMM-based parametric synthesis approach was adopted for its convenience and effectiveness in building a new voice, especially for the nonnative developers. Some useful techniques were also integrated into our system, such as minimum generation error (MGE) training, phone duration modeling and linear spectral...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009